1,357 research outputs found

    Towards Task Understanding in Visual Settings

    We consider the problem of understanding real-world tasks depicted in visual images. While most existing image captioning methods excel in producing natural language descriptions of visual scenes involving human tasks, there is often the need for an understanding of the exact task being undertaken rather than a literal description of the scene. We leverage insights from real-world task understanding systems and propose a framework composed of convolutional neural networks and an external hierarchical task ontology to produce task descriptions from input images. Detailed experiments highlight the efficacy of the extracted descriptions, which could potentially find their way into many applications, including image alt text generation. Comment: Accepted as Student Abstract at 33rd AAAI Conference on Artificial Intelligence, 201

    PRESCHOOL TEACHERS’ VIEWS TOWARD HOME VISITS IN THE ACTIVITY OF FAMILY PARTICIPATION WITHIN THE PRESCHOOL CURRICULUM

    The aim of this study is to determine preschool teachers’ views on home visits. Data were collected through a semi-structured interview form developed by the researcher, consisting of two sections: demographic information and preschool teachers’ views on home visits. Thirty preschool teachers participated in the study. The results suggest that the preschool teachers thought they needed to make home visits to children with problems, to get information about the child’s home environment, to get to know the child, and to learn about their family relations.

    INFORMATION TECHNOLOGY USAGE OF ACCOUNTANTS

    The purpose of this article is to investigate the reasons behind accountants’ information technology (IT) usage. To this end, the study examines the relationship among attitude, subjective norms, intention, and behavior, based on the Theory of Reasoned Action developed by Ajzen and Fishbein. It investigates the effect of attitude and subjective norms on the intention towards IT usage behavior, and the effect of that intention on IT usage. The data were obtained from 456 accountants via a questionnaire. The regression analysis indicates that the intention towards IT usage behavior has a statistically significant impact on IT usage behavior: if the intention towards IT usage is positive, behavior is also positive. Attitude and subjective norms towards IT usage behavior also have a statistically significant impact on the intention towards IT usage behavior: if an individual’s attitude and subjective norms towards IT usage are positive, the intention towards IT usage is also positive.

    Feedback driven adaptive combinatorial testing

    The configuration spaces of modern software systems are too large to test exhaustively. Combinatorial interaction testing (CIT) approaches, such as covering arrays, systematically sample the configuration space and test only the selected configurations. The basic justification for CIT approaches is that they can cost-effectively exercise all system behaviors caused by the settings of t or fewer options. We conjecture, however, that in practice many such behaviors are not actually tested because of masking effects: failures that perturb execution so as to prevent some behaviors from being exercised. In this work we present a feedback-driven, adaptive, combinatorial testing approach aimed at detecting and working around masking effects. At each iteration we detect potential masking effects, heuristically isolate their likely causes, and then generate new covering arrays that allow previously masked combinations to be tested in the subsequent iteration. We empirically assess the effectiveness of the proposed approach on two large, widely used open source software systems. Our results suggest that masking effects do exist and that our approach provides a promising and efficient way to work around them.
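    The covering-array idea summarized above can be illustrated with a minimal sketch. It is not the authors' tool: the function name `uncovered_combinations`, the three binary options, and the choice of which configuration fails are all assumptions made for illustration. The sketch lists the t-way option-value combinations that a set of configurations never exercises, which is one way to see what a masking failure removes from coverage.

```python
from itertools import combinations, product


def uncovered_combinations(configs, domains, t=2):
    """Return the t-way option-value combinations that no configuration exercises.

    configs: list of tuples, one value per option.
    domains: list of tuples giving the allowed values of each option.
    """
    missing = set()
    for cols in combinations(range(len(domains)), t):
        # Value tuples actually seen for this choice of t options.
        seen = {tuple(cfg[c] for c in cols) for cfg in configs}
        for values in product(*(domains[c] for c in cols)):
            if values not in seen:
                missing.add((cols, values))
    return missing


# Hypothetical system with three binary options. Exhaustive testing needs
# 2**3 = 8 configurations; these 4 cover every pair of option settings.
domains = [(0, 1), (0, 1), (0, 1)]
configs = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

all_pairs_covered = not uncovered_combinations(configs, domains, t=2)

# Assume the last configuration fails early, so nothing it was meant to
# exercise is observed. The pairs below are then masked and would need to
# be rescheduled in a later covering array.
masked = uncovered_combinations(configs[:3], domains, t=2)
```

    Under these assumptions, dropping one failing configuration leaves three pairwise combinations untested, which is the kind of gap a feedback-driven approach would try to close in the next iteration.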

    Auditing Search Engines for Differential Satisfaction Across Demographics

    Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users. In this paper, we present a framework for internally auditing such services for differences in user satisfaction across demographic groups, using search engines as a case study. We first explain the pitfalls of naïvely comparing the behavioral metrics that are commonly used to evaluate search engines. We then propose three methods for measuring latent differences in user satisfaction from observed differences in evaluation metrics. To develop these methods, we drew on ideas from the causal inference literature and the multilevel modeling literature. Our framework is broadly applicable to other online services, and provides general insight into interpreting their evaluation metrics. Comment: 8 pages. Accepted at WWW 201

    Generating Query Suggestions to Support Task-Based Search

    We address the problem of generating query suggestions to support users in completing their underlying tasks (which motivated them to search in the first place). Given an initial query, these query suggestions should cover the possible subtasks the user might be looking for. We propose a probabilistic modeling framework that obtains keyphrases from multiple sources and generates query suggestions from these keyphrases. Using the test suites of the TREC Tasks track, we evaluate and analyze each component of our model. Comment: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17), 201

    MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation

    The MultiWOZ 2.0 dataset has greatly stimulated research on task-oriented dialogue systems. However, its state annotations contain substantial noise, which hinders a proper evaluation of model performance. To address this issue, considerable effort has been devoted to correcting the annotations, and three improved versions (i.e., MultiWOZ 2.1-2.3) have been released. Nonetheless, many incorrect and inconsistent annotations remain. This work introduces MultiWOZ 2.4, which refines the annotations in the validation set and test set of MultiWOZ 2.1. The annotations in the training set remain unchanged (same as MultiWOZ 2.1) to elicit robust and noise-resilient model training. We benchmark eight state-of-the-art dialogue state tracking models on MultiWOZ 2.4. All of them demonstrate much higher performance than on MultiWOZ 2.1.

    ASSIST: Towards Label Noise-Robust Dialogue State Tracking

    The MultiWOZ 2.0 dataset has greatly boosted the research on dialogue state tracking (DST). However, substantial noise has been discovered in its state annotations. Such noise brings about huge challenges for training DST models robustly. Although several refined versions, including MultiWOZ 2.1-2.4, have been published recently, there are still lots of noisy labels, especially in the training set. Besides, it is costly to rectify all the problematic annotations. In this paper, instead of improving the annotation quality further, we propose a general framework, named ASSIST (lAbel noiSe-robuSt dIalogue State Tracking), to train DST models robustly from noisy labels. ASSIST first generates pseudo labels for each sample in the training set by using an auxiliary model trained on a small clean dataset, then puts the generated pseudo labels and vanilla noisy labels together to train the primary model. We show the validity of ASSIST theoretically. Experimental results also demonstrate that ASSIST improves the joint goal accuracy of DST by up to 28.16% on MultiWOZ 2.0 and 8.41% on MultiWOZ 2.4, compared to using only the vanilla noisy labels.
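    The abstract above does not say how the pseudo labels and noisy labels are put together. One simple possibility, shown here purely as an assumption, is a convex combination of the two label distributions used as the training target. The function `combine_labels`, the weight `alpha`, and the three-value slot are hypothetical and are not taken from the listing.

```python
def combine_labels(noisy, pseudo, alpha=0.5):
    """Blend two label distributions into one training target.

    noisy:  one-hot (or soft) label from the original, possibly wrong annotation.
    pseudo: label distribution predicted by an auxiliary model.
    alpha:  assumed mixing weight; 1.0 trusts only the pseudo label,
            0.0 trusts only the noisy label.
    """
    if len(noisy) != len(pseudo):
        raise ValueError("label vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * p + (1.0 - alpha) * n for n, p in zip(noisy, pseudo)]


# Hypothetical slot with three candidate values. The annotation picks the
# second value, while the auxiliary model picks the first.
noisy_label = [0.0, 1.0, 0.0]
pseudo_label = [1.0, 0.0, 0.0]
target = combine_labels(noisy_label, pseudo_label, alpha=0.5)
```

    With `alpha=0.5` the target splits its mass evenly between the two disagreeing values, so a primary model trained against it is penalized less for ignoring a label that may be wrong.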

    Evaluating the Cranfield Paradigm for Conversational Search Systems

    Due to the sequential and interactive nature of conversations, the application of traditional Information Retrieval (IR) methods like the Cranfield paradigm requires stronger assumptions. When building a test collection for Ad Hoc search, it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance judgments perceived by an actual user of the search engine. However, when building a test collection for conversational search, we do not know if it is fair to assume that the relevance judgments provided by an annotator correlate well with the relevance judgments perceived by an actual user of the conversational search system. In this paper, we perform a crowdsourcing study to evaluate the applicability of the Cranfield paradigm to conversational search systems. Our main aim is to understand the level of agreement in terms of user satisfaction between the users performing a search task in a conversational search system (i.e., directly assessing the system) and the users observing the search task being performed (i.e., indirectly assessing the system). The result of this study is paramount because it underpins and guides 1) the development of more realistic user models and simulators, and 2) the design of more reliable and robust evaluation measures for conversational search systems. Our results show that there is a fair agreement between direct and indirect assessments in terms of user satisfaction and that these two kinds of assessments share similar conversational patterns. Indeed, by collecting relevance assessments for each system utterance, we tested several conversational patterns that show a promising ability to predict user satisfaction.